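Most methods for crafting adversarial attacks have focused on scenes with a single dominant object (e.g., images from ImageNet). Natural scenes, on the other hand, include multiple dominant objects that are semantically related. It is therefore crucial to explore attack strategies that look beyond learning on single-object scenes or attacking single-object victim classifiers. Because generative models inherently produce perturbations that transfer strongly to unknown models, this paper presents the first approach that uses generative models for adversarial attacks on multi-object scenes. To represent the relationships between different objects in the input scene, we leverage the open-source pre-trained vision-language model CLIP (Contrastive Language-Image Pre-training), with the motivation of exploiting the semantics encoded in the language space along with the visual space. We call this attack approach Generative Adversarial Multi-object scene Attacks (GAMA). GAMA demonstrates the utility of the CLIP model as an attacker's tool for training formidable perturbation generators for multi-object scenes. Using joint image-text features to train the generator, we show that GAMA can craft potent transferable perturbations that fool victim classifiers in various attack settings. For example, GAMA triggers about 16% more misclassification than state-of-the-art generative approaches in black-box settings where both the classifier architecture and the data distribution of the attacker differ from those of the victim. Our code will be made publicly available soon.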
translated by 谷歌翻译
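Convolutional neural network (CNN) approaches available in the current literature are designed to work primarily with low-resolution images. When they are applied to very large images, challenges arise related to GPU memory, a receptive field smaller than semantic correspondence requires, and the need to incorporate multi-scale features. The resolution of the input images can be reduced, but only at the cost of a significant loss of critical information. Based on these issues, we introduce a novel research problem of training CNN models for very large images, and present the "UltraMNIST dataset", a simple yet representative benchmark dataset for this task. UltraMNIST has been designed using the popular MNIST digits, with additional levels of complexity added to replicate the challenges of real-world problems well. We present two variants of the problem: "UltraMNIST classification" and "budget-aware UltraMNIST classification". The standard UltraMNIST classification benchmark is intended to facilitate the development of novel CNN training methods that make effective use of the best available GPU resources. The budget-aware variant is intended to promote the development of methods that work under constrained GPU memory. To support the development of competitive solutions, we present several baseline models for the standard benchmark and its budget-aware variant. We study the effect of reducing resolution on performance and present results for baseline models involving pretrained backbones from popular state-of-the-art models. Finally, with the presented benchmark dataset and baselines, we hope to pave the way for a new generation of CNN methods suited to handling large images in an efficient and resource-light manner.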
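Image enhancement approaches often assume that the noise is signal independent and approximate the degradation model as zero-mean additive Gaussian. However, this assumption does not hold for biomedical imaging systems, where sensor-based sources of noise are proportional to signal strength and the noise is better represented as a Poisson process. In this work, we explore a dictionary learning-based approach and present a novel self-supervised learning method for single-image denoising in which the noise is approximated as a Poisson process and no clean ground-truth data is required. Specifically, we approximate traditional iterative optimization algorithms for image denoising with a recurrent neural network that enforces sparsity with respect to the weights of the network. Since the sparse representations are based on the underlying image, the network is able to suppress the spurious components (noise) in the image patches, thereby introducing implicit regularization for denoising tasks through the network structure. Experiments on two bio-imaging datasets demonstrate that our method outperforms state-of-the-art approaches in terms of PSNR and SSIM. Our qualitative results demonstrate that, in addition to higher performance on standard quantitative metrics, we are able to recover much more subtle detail than the other compared approaches. Our code is publicly available at https://github.com/tacalvin/poisson2sparse.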
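The contrast between signal-independent Gaussian noise and signal-dependent Poisson noise can be illustrated with a small simulation. This is only a sketch and is not taken from the authors' code: the helper names `sample_poisson` and `noise_variance` are hypothetical, and the sampler uses Knuth's multiplication method, which is adequate for the small intensities used here.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) sample with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    # Keep multiplying uniforms until the running product drops below exp(-lam).
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def noise_variance(intensity, n_samples, rng):
    """Empirical variance of Poisson-distributed counts at a given pixel intensity."""
    samples = [sample_poisson(intensity, rng) for _ in range(n_samples)]
    mean = sum(samples) / n_samples
    return sum((s - mean) ** 2 for s in samples) / n_samples


rng = random.Random(0)  # fixed seed so the run is repeatable
for intensity in (4.0, 64.0):
    # Under a Poisson model the variance tracks the intensity itself.
    print(intensity, noise_variance(intensity, 2000, rng))
```

Under the Poisson model the printed variance grows roughly in proportion to the intensity, whereas additive Gaussian noise would have the same variance at both intensities.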
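Systemic lupus erythematosus (SLE) is an autoimmune disease in which the patient's immune system starts attacking healthy tissues of the body. Lupus nephritis (LN) refers to the inflammation of kidney tissue caused by these attacks, which results in renal failure. The International Society of Nephrology/Renal Pathology Society (ISN/RPS) has released a classification system based on the various patterns observed during renal injury in SLE. Traditional methods require meticulous pathological assessment of the renal biopsy and are time-consuming. Recently, computational techniques have helped to alleviate this issue through virtual microscopy or whole slide imaging (WSI). Using deep learning and modern computer vision techniques, we propose a pipeline that automates (1) the detection of the various glomerular patterns present in these whole slide images and (2) the classification of each image using the extracted glomerular features.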
Existing federated classification algorithms typically assume the local annotations at every client cover the same set of classes. In this paper, we aim to lift such an assumption and focus on a more general yet practical non-IID setting where every client can work on non-identical and even disjoint sets of classes (i.e., client-exclusive classes), and the clients have a common goal which is to build a global classification model to identify the union of these classes. Such heterogeneity in client class sets poses a new challenge: how to ensure different clients are operating in the same latent space so as to avoid the drift after aggregation? We observe that the classes can be described in natural languages (i.e., class names) and these names are typically safe to share with all parties. Thus, we formulate the classification problem as a matching process between data representations and class representations and break the classification model into a data encoder and a label encoder. We leverage the natural-language class names as the common ground to anchor the class representations in the label encoder. In each iteration, the label encoder updates the class representations and regulates the data representations through matching. We further use the updated class representations at each round to annotate data samples for locally-unaware classes according to similarity and distill knowledge to local models. Extensive experiments on four real-world datasets show that the proposed method can outperform various classical and state-of-the-art federated learning methods designed for learning with non-IID data.
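The idea of treating classification as matching between data representations and class representations can be sketched as follows. This is a minimal illustration, not the authors' implementation: cosine similarity is assumed as the matching score, and the vectors in `class_reprs` are hypothetical stand-ins for label-encoder outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def classify_by_matching(data_repr, class_reprs):
    """Return the class name whose representation best matches data_repr."""
    return max(class_reprs, key=lambda name: cosine(data_repr, class_reprs[name]))


# Hypothetical class representations; in the setting described above these
# would come from a label encoder applied to the natural-language class names.
class_reprs = {
    "cat": [1.0, 0.1, 0.0],
    "dog": [0.1, 1.0, 0.0],
    "car": [0.0, 0.0, 1.0],
}

# A hypothetical data-encoder output for one sample.
print(classify_by_matching([0.9, 0.2, 0.1], class_reprs))
```

Because every client scores samples against the same shared set of class representations, a client can in principle assign a label from a class it never observed locally, which is the property the abstract relies on for annotating locally-unaware classes.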
The rise in data has led to the need for dimension reduction techniques, especially in the area of non-scalar variables, including time series, natural language processing, and computer vision. In this paper, we specifically investigate dimension reduction for time series through functional data analysis. Current methods for dimension reduction in functional data are functional principal component analysis and functional autoencoders, which are limited to linear mappings or to scalar representations of the time series and are therefore inefficient. In real data applications, the nature of the data is much more complex. We propose a non-linear function-on-function approach, consisting of a functional encoder and a functional decoder, that uses continuous hidden layers of continuous neurons to learn the structure inherent in functional data and thereby addresses the aforementioned limitations of existing approaches. Our approach gives a low-dimensional latent representation by reducing both the number of functional features and the number of timepoints at which the functions are observed. The effectiveness of the proposed model is demonstrated through multiple simulations and real data examples.
Landing an unmanned aerial vehicle (UAV) on top of an unmanned surface vehicle (USV) in harsh open waters is a challenging problem, owing to forces that can damage the UAV due to a severe roll and/or pitch angle of the USV during touchdown. To tackle this, we propose a novel model predictive control (MPC) approach enabling a UAV to land autonomously on a USV in these harsh conditions. The MPC employs a novel objective function and an online decomposition of the oscillatory motion of the vessel to predict, attempt, and accomplish the landing during near-zero tilt of the landing platform. The nonlinear prediction of the motion of the vessel is performed using visual data from an onboard camera. Therefore, the system does not require any communication with the USV or a control station. The proposed method was analyzed in numerous robotics simulations in harsh and extreme conditions and further validated in various real-world scenarios.
Multiple studies have focused on predicting the prospective popularity of an online document as a whole, without paying attention to the contributions of its individual parts. We introduce the task of proactively forecasting popularities of sentences within online news documents solely utilizing their natural language content. We model sentence-specific popularity forecasting as a sequence regression task. For training our models, we curate InfoPop, the first dataset containing popularity labels for over 1.7 million sentences from over 50,000 online news documents. To the best of our knowledge, this is the first dataset automatically created using streams of incoming search engine queries to generate sentence-level popularity annotations. We propose a novel transfer learning approach involving sentence salience prediction as an auxiliary task. Our proposed technique coupled with a BERT-based neural model exceeds nDCG values of 0.8 for proactive sentence-specific popularity forecasting. Notably, our study presents a non-trivial takeaway: though popularity and salience are different concepts, transfer learning from salience prediction enhances popularity forecasting. We release InfoPop and make our code publicly available: https://github.com/sayarghoshroy/InfoPopularity
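For readers unfamiliar with the nDCG metric quoted above, it can be computed as in the sketch below. This assumes the linear-gain form of DCG with a log2 rank discount; the abstract does not say which variant or cutoff the authors use, and the function names are hypothetical.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance scores given in ranked order."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(predicted_scores, true_relevances):
    """nDCG of the ranking induced by predicted_scores against true_relevances."""
    # Rank items by predicted score, highest first.
    order = sorted(range(len(predicted_scores)),
                   key=lambda i: predicted_scores[i], reverse=True)
    ranked = [true_relevances[i] for i in order]
    # The ideal ranking sorts items by their true relevance.
    ideal_dcg = dcg(sorted(true_relevances, reverse=True))
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical sentence-level popularity labels and two candidate forecasts.
true_popularity = [3, 2, 1]
print(ndcg([0.9, 0.5, 0.1], true_popularity))  # forecast order matches the labels
print(ndcg([0.1, 0.5, 0.9], true_popularity))  # forecast order is reversed
```

A value of 1 means the predicted ordering of sentences matches the ideal ordering by true popularity; lower values indicate that popular sentences were ranked too low.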
The ability of an agent to continuously learn new skills without catastrophically forgetting existing knowledge is of critical importance for the development of generally intelligent agents. Most methods devised to address this problem depend heavily on well-defined task boundaries, and thus depend on human supervision. Our task-agnostic method, Self-Activating Neural Ensembles (SANE), uses a modular architecture designed to avoid catastrophic forgetting without making any such assumptions. At the beginning of each trajectory, a module in the SANE ensemble is activated to determine the agent's next policy. During training, new modules are created as needed and only activated modules are updated, which ensures that unused modules remain unchanged. This system enables our method to retain and leverage old skills while growing and learning new ones. We demonstrate our approach on visually rich, procedurally generated environments.
We present a novel hybrid learning method, HyLEAR, for solving the collision-free navigation problem for self-driving cars in POMDPs. HyLEAR leverages interposed learning to embed knowledge of a hybrid planner into a deep reinforcement learner so that it determines safe and comfortable driving policies faster. In particular, the hybrid planner combines pedestrian path prediction and risk-aware path planning with driving-behavior rule-based reasoning, such that the driving policies also take into account, whenever possible, the ride comfort and a given set of driving-behavior rules. Our experimental performance analysis over the CARLA-CTS1 benchmark of critical traffic scenarios revealed that HyLEAR can significantly outperform the selected baselines in terms of safety and ride comfort.